Cost drivers associated with diffuse large B-cell lymphoma (DLBCL) in Japan: A structural equation model (SEM) analysis

Diffuse large B-cell lymphoma (DLBCL) is an aggressive non-Hodgkin’s lymphoma of increasing prevalence in Japan. Patients whose disease is relapsed or refractory to first-line treatment (rrDLBCL) have been found to shoulder a greater economic burden and to have poor survival with subsequent lines of therapy. This study assessed the relative impact of individual patient attributes on total medical cost among patients with rrDLBCL receiving second- or third-line (2L/3L) therapy. Structural equation modelling was used to identify potential drivers of the total medical costs incurred by treatment and procedures in a Japanese retrospective claims database. From the database, rrDLBCL patients on 2L or 3L treatment were grouped into respective cohorts. The mean [median] (SD) total medical cost of care was 73,296.40 [58,223.11] (58,409.79) US dollars (USD) for the 2L cohort and 75,238.35 [60,477.31] (59,583.66) USD for the 3L cohort. The largest total effect on medical cost in both cohorts was that of length of hospital stay (LOS) (β: 0.750 [95%CI: 0.728, 0.772] vs β: 0.762 [95%CI: 0.729, 0.794]). Length of hospital stay and potential heart disease complications due to line of treatment were the primary drivers of total cost for patients who had received at least 2L or 3L therapy for rrDLBCL.


Introduction
The incidence of aggressive non-Hodgkin lymphoma (NHL) has been increasing steadily in Japan. By 2008, NHL was responsible for 39.6% of all hematologic malignancies nationwide [1]. Diffuse large B-cell lymphoma (DLBCL) accounts for a large proportion of such lymphoid neoplasms in Japan (35.8%), and the regional disease proportion varies between 25.7% and 39.5% [2]. The standard treatment for DLBCL is the R-CHOP regimen (rituximab [R] + cyclophosphamide/doxorubicin/vincristine/prednisone) administered for 6-8 cycles. A United States (US) claims-based study found that 87.7% of DLBCL patients received combination therapies, and 69.7% had received R-CHOP [3]. A population-based cancer registry in Japan reported the 5-year overall survival for DLBCL patients to be 57% in 2003 1997 [4]. However, after stopping treatment, up to 50% of patients may relapse or become refractory to further treatment [5]. Patients with relapsed or refractory DLBCL (rrDLBCL) have a poor prognosis and unstandardized treatment regimens during subsequent treatment lines [6]. A large study of pooled patient-level data in the US demonstrated that rrDLBCL patients had an overall response rate of 26% to further treatment and a median survival of 6.3 months [7]. Even after autologous stem cell transplantation (auto-SCT), median overall survival for rrDLBCL patients was 9.9 months [8]. While it is critical to understand not only the real-world course of treatment but also the drivers of medical costs for these patients, there is a paucity of research on the economic burden of rrDLBCL in Japan.
Even with poor survival outcomes, the economic burden of DLBCL is high. The average DLBCL-related cost per patient per year in the first year of treatment was reported to be significantly higher for second line (2L) DLBCL patients (210,488 US dollars (USD)) than for first line (1L) patients (25,044 USD) in the US [9]. A separate analysis of the economic burden for matched 1L and 2L DLBCL cohorts in the US highlighted clinical services as the main incremental cost drivers (outpatient (50%) and inpatient (36%) services) [10]. The relationship between the direct and indirect drivers of medical costs for rrDLBCL in Japan, as well as any intermediate effects, remains unclear.
In this study, structural equation modeling (SEM) was used to explore the relationship between patient characteristics, healthcare resource utilization (HCRU) and medical costs for rrDLBCL. Identifying drivers of medical cost may provide insights into how to reduce the economic burden for Japanese patients.

Study design and study population
An administrative retrospective claims database (2008-2020) provided by Medical Data Vision Co., Ltd. (MDV; Tokyo, Japan) was used in this study. Covering approximately 23% of acute hospitals and 30 million patients, the MDV database is a large database of anonymized medical claims from over 400 acute care hospitals in Japan.
The identified patients had at least one DLBCL-related treatment claim between October 1, 2008 and June 30, 2019. The first treatment date was defined as the date of the first DLBCL-related treatment (1L) during this period with the appropriate International Classification of Diseases, 10th revision (ICD-10) diagnosis code (C83.3x, C85.2x or receipt code 8847286). Records were required to have a 6-month look-back period before the index date with at least 1 claim (for any disorder), as used in a previous database study [11]. The minimum follow-up period for inclusion was 12 months, and patient records that did not have at least 2 claims (1 claim in every 6-month period, for any disorder) were excluded in order to capture sufficient cost data to conduct the SEM. Remaining patients were included in further analysis if they had received either 2L or 3L treatment during the identification period. Two separate cohorts (with overlapping patients) were analyzed: one for patients who initiated 2L and one for patients who initiated 3L. The index date was defined as the first administration of second-line treatment for the 2L cohort and of third-line treatment for the 3L cohort. The database was downloaded on October 5, 2020.
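The claim-based inclusion rules above can be illustrated with a minimal Python sketch. It is not the authors' extraction code: the function name `is_eligible` is hypothetical, a "6-month" window is approximated as 183 days, and the claim on the index date itself is assumed to count toward the first follow-up period.

```python
from datetime import date, timedelta

# Assumption: a "6-month" window is approximated as 183 days.
HALF_YEAR = timedelta(days=183)


def is_eligible(index_date, claim_dates):
    """Check the look-back and follow-up claim rules for one patient record.

    index_date  -- date of first administration of the 2L (or 3L) regimen
    claim_dates -- dates of all claims for the patient, for any disorder
    """
    # Look-back: at least one claim in the 6 months before the index date.
    has_lookback = any(
        index_date - HALF_YEAR <= d < index_date for d in claim_dates
    )
    # Follow-up: at least one claim in each of the two 6-month periods
    # that make up the 12-month minimum follow-up.
    in_first_period = any(
        index_date <= d < index_date + HALF_YEAR for d in claim_dates
    )
    in_second_period = any(
        index_date + HALF_YEAR <= d < index_date + 2 * HALF_YEAR
        for d in claim_dates
    )
    return has_lookback and in_first_period and in_second_period
```

A record with one claim before the index date, the index claim itself, and one claim seven months later would pass; removing either the look-back claim or the later claim would exclude it.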

Researchers wishing to access the data used in our study can find and access the specific dataset in the same manner as the authors. We did not receive special privileges that others would not have, from the vendor or third parties. It is not possible for us to upload the data for public access by means of a link. Therefore, we cannot provide a title or URL for the specific data used in this study. If more information regarding the specific data used in our study is needed, please contact BMSKK at the following address: BMSKK JP tower 2-7-2 Marunouchi Chiyoda-ku, Tokyo 100-0710, Japan; Tel: +81-3-5224-0600, Fax: +81-3-5224-0600.
PLOS ONE

Patient characteristics
Patient demographics tabulated included gender, age, and age group. Clinical characteristics, including year of index date, prior treatment regimen, potential complications due to treatment, duration of therapy (1L-3L), and baseline Charlson Comorbidity Index (CCI) score with a breakdown of each comorbidity, as well as a modified index excluding the diagnosis of DLBCL itself, were analyzed to describe the study cohorts. Potential complications from 2L/3L treatment, including heart disease [12], kidney disease [13], and liver disease [14], were defined as new events after the index date among those without these conditions during any prior lines of therapy. Prior or concurrent cancers during the look-back period were also assessed (C00-C96 except for C77-89, i.e., excluding secondary neoplasms and lymphomas). The average duration of each line of therapy (1L-3L) was calculated as months from the first treatment date to the last treatment date on record. CCI scores were calculated using the look-back period (prior to the start of 2L/3L treatment) based on the ICD-10 codes associated with the modified CCI [15].
DLBCL-related treatment was summarized for drugs received within ±30 days of first line treatment initiation so as to also include patients in the middle of their treatment cycle. Subsequent lines of treatment for all included patients were extracted up to 5L+. Treatment lines were grouped in a hierarchical order based on their regimen components: DeVIC-based (dexamethasone, etoposide, ifosfamide, carboplatin) with or without rituximab, R-CHASE-based (rituximab, cyclophosphamide, cytarabine, etoposide, dexamethasone), GDP-based (gemcitabine, dexamethasone, cisplatin) with or without rituximab, R-Bendamustine-based, R-EPOCH-based (rituximab, etoposide, prednisone, vincristine, cyclophosphamide, doxorubicin), R-ESHAP-based (rituximab, etoposide, cytarabine, cisplatin, methylprednisolone), ESHAP-based, R-ICE-based (rituximab, ifosfamide, carboplatin, etoposide), R-DHAP-based (rituximab, dexamethasone, cytarabine, cisplatin), other R-based, and other chemotherapy without rituximab. Lastly, patients receiving conditioning regimens before auto-SCT were also extracted (including MINE, LEED, MCEC, MEAM followed by auto-SCT). Patients who received a combination of rituximab and other immunotherapy were excluded, as such combinations were generally not indicated for treatment of rrDLBCL.
Patients were considered to be on the same line of therapy if they were on the same regimen without a gap. Thus, a treatment regimen was considered a new line of therapy if the patient took a drug not included in their initial treatment regimen more than 30 days after the treatment initiation date, or had a gap in treatment of >90 days (Fig 1). Patients who had a record of SCT (allogeneic (allo-) or auto-) were also considered part of the same line of therapy if the transplant occurred prior to a next line of therapy as described above. The approach for defining lines of therapy has also been previously described [16].
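The two rules for starting a new line of therapy (a new drug more than 30 days after line initiation, or a treatment gap of more than 90 days) can be sketched as follows. This is an illustrative reading of the rules, not the published algorithm [16]: the function name `assign_lines` and the input layout are hypothetical, the regimen of a line is assumed to be the set of drugs given within its first 30 days, and the SCT rule is omitted.

```python
from datetime import date

REGIMEN_WINDOW_DAYS = 30  # drugs started within this window define the regimen
MAX_GAP_DAYS = 90         # a longer gap between administrations starts a new line


def assign_lines(administrations):
    """Label each (date, drug) administration with a line-of-therapy number.

    administrations -- iterable of (date, drug_name) tuples for one patient
    Returns a list of (date, drug_name, line_number) sorted by date.
    """
    labelled = []
    line_no = 0
    line_start = None
    regimen = set()
    previous_day = None
    for day, drug in sorted(administrations):
        if previous_day is None:
            new_line = True  # first administration opens line 1
        elif (day - previous_day).days > MAX_GAP_DAYS:
            new_line = True  # treatment gap of more than 90 days
        elif drug not in regimen and (day - line_start).days > REGIMEN_WINDOW_DAYS:
            new_line = True  # new drug after the regimen window has closed
        else:
            new_line = False
        if new_line:
            line_no += 1
            line_start = day
            regimen = set()
        # Drugs given within the window are treated as part of the regimen.
        if (day - line_start).days <= REGIMEN_WINDOW_DAYS:
            regimen.add(drug)
        labelled.append((day, drug, line_no))
        previous_day = day
    return labelled
```

In this sketch, a patient who starts gemcitabine 59 days after beginning a rituximab-based regimen moves to a second line, and a later administration after a gap of more than 90 days opens a third line even though the drug is unchanged.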

Healthcare resource utilization
Healthcare resources used during follow-up were assessed and included: number of patients receiving each line of therapy (i.e. 2, 3, 4, 5+), hospitalizations, ICU admissions, emergency room visits, any imaging (positron emission tomography (PET) scans, magnetic resonance imaging (MRI), computerized tomography (CT) scans), allogeneic SCT (allo-SCT), auto-SCT, and radiation therapy. Mean (SD), median (Q1, Q3), and minimum and maximum values were calculated for continuous data, and categorical data were summarized as the number of patients and the proportion of the cohort.
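As a minimal sketch of the descriptive summaries listed above, the following uses only the Python standard library. The function names are hypothetical, and the quartile method (`inclusive`) is an assumption, since the source does not state which quantile definition was used.

```python
import statistics


def describe_continuous(values):
    """Summarize a continuous variable as mean (SD), median (Q1, Q3), min, max."""
    # Assumption: quartiles use the "inclusive" method; other software
    # may apply a different quantile definition.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "min": min(values),
        "max": max(values),
    }


def describe_categorical(flags):
    """Summarize a yes/no variable as (number of patients, proportion of cohort)."""
    n = sum(1 for flag in flags if flag)
    return n, n / len(flags)
```

For example, `describe_categorical` applied to a per-patient indicator of auto-SCT receipt would return the patient count and the cohort proportion reported for that resource.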

Medical costs
Medical costs were the main outcome of interest of the SEM. Total medical cost of care was calculated to include both DLBCL-related and non-DLBCL-related costs incurred during each patient's follow-up (from the start of 2L/3L treatment). The components of total cost were inpatient cost, intensive care unit (ICU) cost, outpatient cost, cancer treatment costs, and other pharmacy costs (for drugs prescribed other than for cancer treatment). SCT costs, including those for any allo-SCT and auto-SCT, were also assessed. In addition to the SEM, all of these cost components were described as the number of patients, mean (SD), median (Q1, Q3), as well as minimum and maximum values.
Nominal direct medical costs were obtained in Japanese yen (JPY) directly from the database. Direct unadjusted (nominal) medical costs were presented after conversion from JPY to USD using the exchange rate for the first month of each year [17]. These nominal costs were then adjusted for the Japanese inflation rate, based on the calendar-year average of the Consumer Price Index (reference year: 2020) [18], to give direct adjusted medical costs.
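The two-step cost adjustment described above (currency conversion, then inflation adjustment to 2020) can be sketched as below. The exchange rates and CPI values are hypothetical placeholders chosen only to make the arithmetic easy to follow; they are not the figures from references [17] and [18], and the function name is likewise an assumption.

```python
REFERENCE_YEAR = 2020

# Hypothetical placeholder values, NOT real exchange rates or CPI figures.
JPY_PER_USD_FIRST_MONTH = {2015: 120.0, 2020: 110.0}
CPI_CALENDAR_YEAR_AVERAGE = {2015: 98.0, 2020: 100.0}


def to_adjusted_usd(cost_jpy, year):
    """Convert a nominal JPY cost to inflation-adjusted USD (reference year 2020)."""
    # Step 1: nominal JPY -> nominal USD at that year's first-month rate.
    nominal_usd = cost_jpy / JPY_PER_USD_FIRST_MONTH[year]
    # Step 2: scale by the ratio of the reference-year CPI to that year's CPI.
    inflation_factor = (
        CPI_CALENDAR_YEAR_AVERAGE[REFERENCE_YEAR] / CPI_CALENDAR_YEAR_AVERAGE[year]
    )
    return nominal_usd * inflation_factor
```

With these placeholder values, a 1,200,000 JPY claim from 2015 converts to 10,000 nominal USD and is then scaled by 100/98 to express it in 2020 price levels.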

Statistical analysis
The primary outcome for this study was the drivers of total medical cost. An SEM with path analysis was constructed to assess medical cost drivers as the associations with and between patient profile components (e.g. treatment regimen received, demographics, clinical conditions and HCRU) and total medical cost. The SEM is a measurement model used to define complex relationships between observed variables and their underlying concepts [19]. SEM includes two major components: a measurement model for confirmatory factor analysis and a structural model for multiple regression/path analysis [20]. As the input parameters were not conceptual and were defined from claims data, the model was constructed as path analyses, and due to the skew of medical cost and a sample size under 5000, non-normality was accounted for with robust standard errors [21,22].
All effects observed upon analysis with SEM are presented as direct, indirect and total effects for each cohort. The results of each effect category are presented as coefficients (B), standardized coefficients (β) with 95% confidence intervals (95%CI), and a two-sided test for significance (p-value). As parameter estimates are conventionally presented as standardized coefficients with their level of significance (p<0.05 or p<0.01) [23,24], the significance threshold for all SEM coefficients was set at 5%. The goodness of fit was tested for both SEMs using the standardized root mean square residual (SRMR), for which a value less than 0.08 is considered to indicate a well-fitted model [25]. The SEM pathway diagram showing the hypothesized relationships between variables is presented in Fig 2. Based on prior literature on covariates related to medical cost, patient clinical characteristics, treatment regimen, comorbidities, and complications were theorized to have direct effects on total healthcare cost in the SEM. Given the nature of the retrospective database, in which medical cost is directly derived from an associated procedure or treatment, HCRU was also specified as a direct effect. Index treatment regimen was additionally specified as a mediator, as patient characteristics and comorbidities may also affect treatment regimen, and thus indirectly the medical cost. Similarly, complications and HCRU were also specified as mediators, as comorbidities and index regimen may indirectly impact total medical cost through certain complications and high HCRU. The total effect for each predictor was subsequently calculated as the sum of the direct and indirect effects.
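The decomposition into direct, indirect and total effects can be illustrated with a short sketch. It assumes the standard path-analysis convention that each indirect effect is the product of the coefficients along a mediating path; the function name and the coefficient values are hypothetical and are not estimates from this study.

```python
from math import prod


def decompose_effect(direct, indirect_paths):
    """Combine a direct effect with indirect effects through mediators.

    direct         -- coefficient of the path predictor -> total cost
    indirect_paths -- list of paths, each a list of coefficients along one
                      mediated route (e.g. predictor -> mediator -> total cost)
    """
    # Each mediated route contributes the product of its path coefficients.
    indirect = sum(prod(path) for path in indirect_paths)
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}


# Hypothetical standardized coefficients: a predictor with a small direct
# effect on cost, a positive route via one mediator and a negative route
# via another.
example = decompose_effect(
    direct=0.10,
    indirect_paths=[[0.30, 0.75], [0.20, -0.05]],
)
```

In this made-up example the indirect effect (0.215) exceeds the direct effect (0.10), giving a total effect of 0.315. Routes with opposing signs partly cancel, which is why a direct effect and an indirect effect for the same predictor can point in different directions.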

Results
There were 4,208 patient records included in the 2L cohort and 1,702 patient records in the 3L cohort (Fig 3).

Patient profile of 2L cohort
Patient profiles for both cohorts are presented for several key characteristics. Nearly one third of patients had prior radiation therapy (32.7%), and a small proportion had prior SCT (13.5%). The mean [median] (SD) duration of the 2L regimen was 3.7 [2.4] (5.5) months (S1 Table in S1 File). Congestive heart failure and chronic pulmonary disease were present in 20.3% and 22.0% of patients, respectively. The proportion of the 2L group with a baseline CCI score of 5 or greater decreased from 31.0% to 27.9% after removing the DLBCL diagnosis from the calculation of the CCI score.

Patient profile of 3L cohort
In the 3L cohort, 55.8% were male and the mean [median] (SD) age was 67.7 [69.0] (12.4) years for the entire population. A minority of patients were aged 71 years or above (45.1%). The index year of treatment was 2017 or later for 49.5% of these patients. Mean [median] (SD) follow-up time was 820.6 [581.5] (650.0) days. A large minority of patients had prior radiation therapy (36.7%) or prior SCT (23.1%). The mean [median] (SD) duration of 2L in the 3L cohort was 2.6 [1.9] (3.1) months (S1 Table in S1 File). Comorbidities were identified in many 3L patient records, including congestive heart failure and chronic pulmonary disease in 24.5% and 25.8% of patients, respectively. Mild liver disease and metastatic solid tumors were also found in 27.4% and 18.2% of patients, respectively. Almost one third of patients had a CCI score of 5 or greater (31.5%) even after removing the DLBCL diagnosis from the calculation.
While the proportions of patients receiving multiple lines of treatment differed slightly, the most common treatment categories for the 2L and 3L cohorts followed similar patterns (Table 2). In both cohorts, gemcitabine, dexamethasone, cisplatin/carboplatin (GDP)-based therapy with or without rituximab was the most common specific regimen across all treatment lines (range: 7.7%-9.0%). The largest share of the 2L cohort (44.4%) received other R-based therapy in 2L. The proportion receiving other R-based therapy decreased to 21.6% by 5L in the 2L cohort and was 22.7% at 5L in the 3L cohort. During 3L for the 2L cohort, 24.9% received other R-based therapy and 28.1% received other chemotherapy without R. The 3L regimen for the 3L cohort, in contrast, was split almost evenly between other R-based therapies (32.7%) and other chemotherapy without rituximab (30.4%). Patients receiving other R-based regimens tended to switch to other chemotherapy without rituximab by 3L, and this proportion steadily increased as patients progressed through 5L.
Very few patients received induction regimens prior to auto-SCT, as detailed in S2 Table in S1 File. HCRU for the 2L and 3L cohorts was relatively similar with the exception of transplantation outcomes. ICU admissions and emergency room visits were rare, at less than 5% for both cohorts. Notably, 13.3% of 2L cohort patients received an auto-SCT during the follow-up period, compared with 21.4% of 3L cohort patients.
Medical costs for the cohorts were comparable (Table 3), with the 3L cohort having slightly higher total follow-up costs compared to 2L (by less than 2,000 USD).

SEM outcomes
Estimates of total effects on medical cost, and their component indirect and direct effects, are presented for the 2L and 3L cohorts (Table 4). Other prior or concurrent primary cancers besides DLBCL did not have a total effect on cost for either the 2L or 3L cohort.

Discussion
Total medical cost during follow-up was relatively similar between the 2L and 3L cohorts, with average costs of 73,296 USD for 2L and 75,238 USD for 3L patients. The two treatment cohorts of rrDLBCL patients had similar baseline characteristics, HCRU, costs and cost drivers, with a few notable exceptions in terms of relative cost driver size. LOS and heart disease complications were consistently the largest drivers of medical costs for both the 2L and 3L cohorts. In 3L, the effect of LOS was about four times larger than that of heart disease complications, and in 2L it was about three times larger. The 2L cohort had about one third fewer auto-SCTs than the 3L cohort, and SCT was the third largest cost driver in the 3L cohort, whereas the R-CHASE regimen was the third largest in the 2L cohort. These differences may reflect complex clinical decision-making about curative treatments based on the baseline characteristics of patients who have rrDLBCL refractory to more than one line of salvage chemotherapy in Japan. The large effect of LOS is distinct from other studies using SEM to calculate effects on medical cost. For example, an SEM path analysis of respiratory syncytial virus in Japanese children found the effect of LOS on medical cost to be high but approximately 10 times lower than the effect of blood transfusions [27]. In the present study there were also SEM parameters for the cohorts in which direct effects were positive and indirect effects were negative, or vice versa.
Indirect effects tended to be much larger than direct effects and thus contributed more to the total effects, underscoring the importance of realistic model design based on past literature to capture all pertinent effects. The intersection of cohort patient characteristics and their treatment patterns is one indication that there are differences in how more advanced rrDLBCL patients are treated in Japanese real-world practice. The 2L cohort was only slightly older than the 3L cohort; however, a larger proportion of 3L patients had comorbidities. SCT is considered to be the optimal treatment option for eligible patients with rrDLBCL [28], but within the follow-up period of the current analysis, the proportion of auto-SCT in the 2L cohort was about one third smaller than in the 3L cohort. This may be due to patients receiving their high-dose chemotherapy (HDC) more than 30 days after second line initiation or after a 90-day treatment gap, and thus being counted as receiving third line treatment, resulting in a slightly higher proportion of SCTs counted in the 3L cohort. Due to the complexity of claims data and the high heterogeneity of salvage regimen drugs and HDC drugs, as well as the timing of HDC, explicit separation between salvage chemotherapy and HDC was not further conducted. On the other hand, in spite of their comorbidities, the poorer prognosis of the 3L cohort may have required intensive therapy as conditioning for auto-SCT to further prolong survival. For example, a study of rrDLBCL patients in a single center in the UK found a considerable drop in complete response rates for rrDLBCL from 2L (27.0%), to 3L (17.5%), to 4L (2.4%) [29]. SCT also had one of the largest effects on medical cost for both cohorts, though it was relatively higher in 3L. A study of Canadian patients similarly found that SCT had a larger impact on medical cost for patients receiving more than one line of treatment for DLBCL [30].
Table footnotes: Reference treatment group = other chemotherapy without rituximab. † Includes only patients who underwent auto-SCT after regimen; patients who underwent induction therapies but did not undergo auto-SCT after the regimen were counted as "Other chemotherapy without R". ‡ Hu and Bentler, 1999: SRMR of <0.08 represents a well-fitted model.
There were several protective factors for medical costs. Increasing age was associated with decreases in cost, mostly due to the shorter survival time (thus observation period) of older patients. Similarly, patients with later index years had a shorter observation period, thus index year was adjusted for in the model, but its coefficient should be interpreted with caution. Exploratory analysis of medical costs for each age group shows decreasing cost with age outside of the SEM, as well as decreasing follow-up time with age. The total effects from SEM results showed that females had significantly less cost burden. Outside of the SEM, females also had lower costs with comparable follow-up time.
The real-world treatment patterns used to treat rrDLBCL in Japan are diverse and have different impacts on overall medical cost. This treatment has been shown to have some efficacy in rrDLBCL in a phase II study (overall response rate 67%) [31], but it has not been studied in detail from an economic perspective [28].
This study has a few limitations. First, due to the nature of retrospective claims studies, patients cannot be traced longitudinally and the exact line of therapy assigned may be subject to bias. Additionally, medical costs accrued outside of the facilities captured by the database are not accounted for, which may contribute to an underestimation of the total medical costs. Lastly, due to the complex paths used and the large number of predictors, statistical significance should be interpreted holistically and with caution.
This study is the first in Japan to investigate the relationship between patient attributes, healthcare utilization, and total medical cost in rrDLBCL patients. Our study proposed a holistic model of the drivers of medical cost in a complex disease with poor prognosis. The findings suggest that although age and gender have a direct impact on total cost in both 2L and 3L, complications and treatment regimen also impact total cost, largely through indirect effects.